UNITED STATES 

NAVAL POSTGRADUATE SCHOOL 



APPLICATION OF STATISTICAL METHODS TO 
NAVAL OPERATIONAL TESTING 

by 

Alwyn Smith, Jr. 

Lieutenant Commander, United States Navy 





APPLICATION OF STATISTICAL METHODS TO 
NAVAL OPERATIONAL TESTING 



by 

Alwyn Smith, Jr. 


Lieutenant Commander, United States Navy 



Submitted in partial fulfillment 
of the requirements 
for the degree of 
MASTER OF SCIENCE 

United States Naval Postgraduate School 
Monterey, California 






1956 






APPLICATION OF STATISTICAL METHODS 

TO 

NAVAL OPERATIONAL TESTING 



* * * * 



Alwyn Smith, Jr 



This work is accepted as fulfilling 
the thesis requirements for the degree of 



MASTER OF SCIENCE 



from the 

United States Naval Postgraduate School 



PREFACE 



This thesis is concerned with the application of statistical methods 
to naval operational testing, and more specifically, with testing of the 
type conducted by OpDevFor and similar testing agencies. The statis-
tical methods considered are those of confidence limits, sequential
analysis and the more recently developed Statistical Decision Theory.
These techniques are regarded as of interest to Naval line officers in
various categories of billets, particularly Project Officers at testing 
agencies and officers concerned with planning which is based on the 
results of testing programs. Frequently, such officers are unfamiliar 
with the use and limitations of the three methods and the relations of 
these methods to each other. 

Statistical Decision Theory, in particular, can be useful at higher 
levels of the naval establishment such as the offices of CNO. Naval 
planners at this level may be faced with a difficult problem in connec- 
tion with testing programs whose objective is the estimation of the per- 
centage effectiveness or probability of success of a weapon, in future 
combat. Such programs are costly to conduct, and increasingly so 
when an expensive weapon is tested to destruction. Costs of a different 
nature are those associated with the possible consequences if a poor 
estimate of the weapon's effectiveness is obtained from tests. The fun- 
damental problem is to determine how many trials are to be conducted 
and hence how many weapons should be tested. Attempts to solve the 
problem by reconciling the conflicting costs will generally lead to a
dilemma. The application of Statistical Decision Theory to this problem 
is contingent upon the ability of the planner to specify the inputs or data 
required by the theory. An essential objective of this thesis is to show 
how a planner might be guided in specifying these inputs. 

The thesis has been written with a view towards its usefulness for 
personnel with a minimum background in probability and statistics. It 
is addressed also to students and practitioners of Operations Analysis
who may be concerned with the relations of the three statistical methods 
which are considered as tools which may provide quantitative basis for 
executive decision. 

The writer's interest in the possible applications of this theory was
aroused during the study of Statistical Decision Theory at the U. S.
Naval Postgraduate School. The need for investigations as to how the
inputs could be specified was pointed out by Lt. R. A. Tucker, USN in
his thesis: An Introduction to Statistical Decision Functions [1].

Tucker’s paper presents detailed discussions of the mathematical con- 
cepts involved in the theory and the precise mathematical steps 
required to obtain the solution. It is intended for readers with less 
mathematical background than is required for an understanding of the 
basic work Statistical Decision Functions by Abraham Wald [2] . 

Readers who are interested in the theory and detailed computations 
should refer to Wald and Tucker. 

This paper is divided into five chapters. Chapter I discusses the 
relationship between types of operational testing problems at various 
levels of the Navy. Chapters II and III are examples of applications
of statistical methods at the testing agency level. In Chapter IV, an 
example of a guided missile is used as a vehicle of discussion as to how 
the inputs required by Statistical Decision Theory may be specified by 
an office of CNO. A solution to the example is then given. In Chapter 
V the effect of variation of parameters is shown. 

This thesis was written at the U. S. Naval Postgraduate School,
Monterey, California, during the period January-May, 1956. I am
indebted to Professor Thomas E. Oberbeck for his continued patience,
encouragement and most capable guidance while acting as faculty advi-
sor; and for permission to use solutions he has obtained by program-
ming the problem for an electronic computer. I wish to thank Professor
C. A. Magwire for his valuable assistance as second reader, and Mrs.
D. P. Slingerland for her meticulous preparation of this typescript.
Appreciation is also expressed here to personnel of VX-4 and the U. S. 
Naval Air Missile Test Center, Point Mugu, California, who provided 
much helpful information on the practical aspects of testing. 

The graph on page 8 is reproduced from Burrington and May’s: 
Handbook of Probability and Statistics by permission of the publishers, 
Handbook Publishers, Inc. , Sandusky, Ohio. 






TABLE OF CONTENTS 



Item            Title

Chapter I       Introduction

Chapter II      Confidence Intervals

Chapter III     Sequential Analysis

Chapter IV      Statistical Decision Theory

                A. Example and Specification of the Inputs

                B. Solution of the Example

Chapter V       Variation of Parameters

Bibliography

Appendix A      Construction of the Graph for Sequential Tests






LIST OF ILLUSTRATIONS 



Table

1.  Table of Test Results
2.  Solution of Example, Ch. IV (Linear W)
3.  Solution of Example, Ch. IV (Simple W)

Figure

1.  Confidence Interval Chart
2.  Power Curve for Case A
3.  Graph of Sequential Test
4.  Power Curve for Case B
5.  Power Curve for Case C
6.  Sequential Test for Case A
7.  Sequential Test for Case B
8.  Sequential Test for Case C
9.  Uniform Probability Density Function
10. Triangular Probability Density Function
11. Cases Leading to Wrong Decisions
12. Penalty associated with d_1
13. Penalty associated with d_2
14. Linear Weight Function
15. Linear Weight Function Showing the Parameter 3
16. Simple Weight Function
17. Plot of Maximum Number of Trials vs. Cost of Testing (Linear W)
18. Plot of Maximum Number of Trials vs. Cost of Testing (Simple W)






TABLE OF SYMBOLS 
(Listed in order of their use in the text)

p_e       an estimate of the parameter, p
p         a parameter value (probability of success)
x         a variable representing the outcome of a trial
n         the number of trials in any testing program
L_1       the lower limit of a confidence interval
L_2       the upper limit of a confidence interval
a         a confidence coefficient
p_n       specific values of p (n = 1, 2)
α         the maximum risk of rejecting an acceptable weapon
β         the maximum risk of accepting an unacceptable weapon
s         the number of successes in a series of trials
f_1(p)    uniform probability density function
f_2(p)    triangular probability density function
0         a particular value of p which characterizes the terminal
          decisions
d_1       the decision to convert (terminal decision)
d_2       the decision to develop (terminal decision)
W         a weight function
9         a parameter which characterizes W
C         cost function
c         the cost of one trial
          the minimum number of trials required
          the maximum number of trials required
(i, j)    coordinates representing the number of failures and successes
          observed






CHAPTER I 



INTRODUCTION 



This thesis is concerned with the application of some statistical 
methods to testing programs of the type conducted by OpDevFor or 
similar agencies, operating directly for CNO. Such programs are 
defined as naval operational testing. The thesis does not consider 
applications of these methods to quality control or to development or
engineering tests such as those which might be conducted by the mater-
ial Bureaus.

The scope is further limited to those operational testing pro- 
grams in which the object is to obtain an estimate of the probability 
of success of a weapon or weapon system, in future combat. As the 
phrase is used, probability of success can be thought of as generally 
equivalent to hit probability, percent effectiveness or reliability. The 
term testing program is used to describe any test which consists of a 
series of independent trials conducted to obtain the above estimate. 

The estimate is regarded as providing a quantitative basis for execu- 
tive decision at some level of the Naval organization. 

A fundamental problem in any testing program is the determination
of the number of trials to be conducted. For some programs this can
be difficult, depending essentially on the magnitude of various "costs".
It is clear that the number of trials is related to cost. The reconcilia-
tion of all the costs involved is sometimes difficult. This concept is
well phrased by Breakwell [3]. In a sentence taken out of context:

A balance is sought between (1) the cost of testing for reliability 
and (2) the risks, because of limiting testing, of either accepting 
an insufficiently reliable product or rejecting a sufficiently reli- 
able one. 

As (1) increases the natural tendency is toward fewer trials; as (2) 
increases the tendency is to desire more trials. Thus, if a balance is 
to be obtained, the statistical methods used must explicitly relate the
"cost", the number of trials and the decisions to accept the product or
weapon. Statistical decision theory provides a rational basis for attack-
ing this problem but it can only be applied when these costs can be spe-
cified. Frequently, these costs cannot be estimated by the testing
agency. Under these circumstances, the costs must be furnished to
the agency or the agency is compelled to apply a statistical method which
is not based on such estimates. Chapters II and III are devoted to such
methods. 

Line officers who are unfamiliar with statistical techniques may 
often be directly concerned with testing programs. Chapter II serves 
to introduce the technique of confidence intervals, which are often used 
in reporting test results. It points out that this technique does not pro- 
vide the planner of the testing program with adequate guidance as to 
how he should specify the number of trials. Chapter III illustrates
the use of sequential analysis. It is considered applicable to testing 
programs which are conducted to compare improved weapons with an 
existing weapon. 

Testing programs which involve new weapons such as guided mis- 
siles, where the costs in (1) and (2) are high, provide a possible field 
for application of statistical decision theory. It is considered that the
cost estimates required may be available to naval planners at the CNO 
level. Chapter IV illustrates the planner's role in the application of 
this theory to the problem of testing a guided missile. This chapter 
can be read without reference to preceding chapters, if desired.

CHAPTER II 



CONFIDENCE LIMITS 



Consider the problem of a fleet testing agency in estimating the
effectiveness or usefulness of a weapon or weapon system, for exam-
ple, a destroyer which is equipped with a new or improved anti-submar-
ine weapon. In order to express the effectiveness of this weapon system
quantitatively, some measure must be used. Suppose the measure cho-
sen is the percentage of hits achieved by the system in a series of inde-
pendent trials. This measure is regarded as an approximation of the
percentage of hits which will be achieved by the system in a future war,
but the actual percentage of hits, or the true value of p as it will be
called, is an unknown quantity. It is assumed that this true value can
be estimated by suitable testing. This estimate will be designated by
p_e.

Assume that the naval planner has a testing program which simu-
lates as far as possible the combat conditions under which the system
might be used. Assume also that the number of simulated attacks or
trials will be fairly large (50 or more) and that the trials will be
conducted so that they represent a random sample of observations.
Further assume that the number of trials to be made is fixed by
limitations over which the planner has no control.

It is intuitively apparent that the accuracy of the estimate, p_e,
will depend upon the number of trials conducted; the larger the number
of trials, n, the greater the accuracy of the estimate will be. If we
designate the result of a trial by x, then we may consider that x can
have only two values: x = 1 for a success or hit, and x = 0 for a fail-
ure or miss. The estimate p_e may then simply be the total number of
hits divided by the total number of trials. However, this estimate may
not precisely represent the true p; therefore, a measure of the possi-
ble uncertainty in p_e is desirable since the test result will be used as
a basis for making statements about the true p of the system. The
use of confidence intervals, or limits, provides this measure. This
technique is best illustrated by an example.

Suppose 50 trials of the system have been conducted and 15 hits
scored; thus p_e = 15/50 = .30. What statements can the planner make
about the true value of p? By the statistical method known as deter-
mining confidence intervals, two limits, say L_1 and L_2, can be
computed. He can then say that the true value of p lies in the interval
between these limits, but he can make this statement only with some
arbitrary degree of assurance that it is correct. This degree of assur-
ance or confidence is expressed by a confidence coefficient a. Its
value depends upon the degree of confidence the planner desires to have
when he makes the statement that p lies in the interval L_1 to L_2.
If he wants to be 95% certain then the statement would be

(A)     Prob (L_1 < p < L_2) = .95

where L_1 = .17 and L_2 = .43, for this example. This expres-
sion should be read: "The probability is .95 that the variable limits
L_1 and L_2 include the true value p between them". This implies
that there is a 5% chance of being wrong and that the true value of p
might be outside this interval. A similar statement could be made with,
say, 99% confidence (one chance in 100 of being wrong), but if this degree
of assurance were demanded the effect would be to spread the limits L_1
and L_2 farther apart, that is, (.15 to .49). Thus, the planner would
be more assured about the truth of his statement but at the same time
less certain of the value of p.

As Mood [4] points out, (A) should be carefully interpreted because
it appears that p is a variable when actually it is not, p being a fixed
value, the true hit probability of the weapon. The variables are L_1
and L_2. With this fact in mind, (A) has the meaning that we are 95%
certain that the interval formed by L_1 and L_2 includes p.

L_1 and L_2 are used to represent the following variables:




From (A) it can be seen that the limits L_1 and L_2 are functions
of p_e, a and n, that is, in functional notation

        L_1 = L_1(p_e, a, n)

        L_2 = L_2(p_e, a, n)

These relations show that the limits depend upon the outcome of the
test, p_e, the number of trials, n, and the confidence coefficient,
a. Thus, as pointed out above, if p_e = .3, n = 50 and a = .95,
then L_1 and L_2 are determined as .17 and .43, respectively.

The planner may feel that the confidence interval .17 to .43,
computed on the basis of p_e = .3 from 50 trials, is too large and that
if more trials had been conducted he could have located p within nar-
rower limits. If he had obtained the same outcome of p_e = .3 on tests
which had consisted of 50, 100 and 1000 trials, respectively, the follow-
ing Table illustrates the change in the confidence interval as n and a
are changed:



No. of trials     Approx. 95% limits     Approx. 99% limits

      50              .17 - .43              .15 - .49
     100              .21 - .40              .19 - .43
    1000              .27 - .33             .265 - .335
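
The dependence of the limits on n and a can be sketched numerically. The following is a minimal illustration only, assuming the familiar normal approximation to the binomial rather than the chart from which the tabulated limits were read; the function name approx_confidence_limits is hypothetical. The computed limits should agree only roughly with the Table, particularly for the smaller values of n, but they show the same narrowing of the interval as n increases and the same widening as a increases.

```python
from math import sqrt
from statistics import NormalDist


def approx_confidence_limits(hits, trials, a=0.95):
    """Return (L_1, L_2) for the true p by the normal approximation.

    hits / trials is the estimate p_e; a is the confidence coefficient.
    Assumption: the limits are taken as p_e +/- z * sqrt(p_e * (1 - p_e) / n),
    which is not the chart method used in the text.
    """
    p_e = hits / trials
    # z is the two-sided normal quantile, about 1.96 for a = .95
    z = NormalDist().inv_cdf(0.5 + a / 2.0)
    half_width = z * sqrt(p_e * (1.0 - p_e) / trials)
    return max(0.0, p_e - half_width), min(1.0, p_e + half_width)


for n in (50, 100, 1000):
    hits = round(0.3 * n)  # the same outcome p_e = .3 at each sample size
    low95, high95 = approx_confidence_limits(hits, n, 0.95)
    low99, high99 = approx_confidence_limits(hits, n, 0.99)
    print(f"{n:5d}   {low95:.3f} - {high95:.3f}   {low99:.3f} - {high99:.3f}")
```
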



Figure 1 is a chart which illustrates the dependence of the confi-
dence interval on n and p_e. For a given value of p_e, a vertical
line intersects two curves corresponding to a given value of n. These
intersections, projected on the vertical axis, are the limits L_1 and
L_2; and the interval formed by these limits spans the true value of p.
Hence, it is labelled as the p axis. This chart is for the confidence
coefficient of .95 and clearly shows the effect of increasing n.
Also the number of trials, n, may be regarded as a function of the
interval [L_2 - L_1], a and p_e. That is, all three quantities
must be known in order to determine n. In functional notation,

        n = n(L_2 - L_1, a, p_e)
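
The relation between n, the length of the interval, a and p_e can be illustrated by a short sketch. It assumes the normal approximation (not the chart of Figure 1), and the function name trials_for_interval and the argument assumed_p_e are hypothetical. The point of the sketch is that a value of p_e must be guessed before any trial is run, and the answer changes with the guess.

```python
from math import ceil
from statistics import NormalDist


def trials_for_interval(length, a, assumed_p_e):
    """Smallest n whose approximate interval L_2 - L_1 is at most `length`.

    assumed_p_e is a guess at the test outcome made before testing,
    which is exactly the quantity the planner does not yet have.
    """
    z = NormalDist().inv_cdf(0.5 + a / 2.0)
    half = length / 2.0
    return ceil(z * z * assumed_p_e * (1.0 - assumed_p_e) / (half * half))


# The required n moves with the guessed outcome, for a fixed length and a.
for guess in (0.1, 0.3, 0.5):
    print(guess, trials_for_interval(0.2, 0.95, guess))
```

A guess of .5 gives the largest n for a given length, so it may serve as a conservative planning figure when nothing is known about the outcome.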



Figure 1. Confidence Interval Chart (confidence coefficient a = .95)

It should be noted that the curves are not very useful for attempt-
ing to determine, in advance, the number of trials required to give a
fixed confidence interval of desired length because this would mean the
outcome of the testing program, p_e, would have to be known in
advance.

It is apparent that the use of confidence limits has a place in naval
testing as a means of stating the results of a testing program in a pre-
cise manner which is more meaningful than simply stating the outcome
of the program as a single number p_e. But the use of this technique
does not provide adequate guidance for advance planning to indicate how
many trials should be run or what degree of confidence should be stipu-
lated. This technique is based on a fixed number of trials. The planner
may have chosen the number, in advance, from considerations of time,
services required, etc., but we wish to emphasize that this theory does
not provide any criteria upon which the planner can base such a choice.






CHAPTER III 



SEQUENTIAL ANALYSIS 

Sequential Analysis was developed by A. Wald in 1943 for use on
problems which arose during World War II. It was widely used in manu-
facturing establishments for acceptance inspection of "lots" of mass
production items. The detailed application of sequential tests to such
problems is given by the Statistical Research Group [5]. The principal
advantage of sequential tests in acceptance inspection is that they reduce
the amount of inspection required. As shown in [5], the methods of
sequential analysis can be applied to experiments. Since certain types
of fleet testing problems can be thought of as "experiments", the appli-
cation of sequential tests to a testing program will be described.

Sequential Analysis can be used when a testing program is to be con-
ducted for the purpose of comparing the hit probability or the probability
of success of a supposedly improved weapon system with that of the exist-
ing system. Testing of this type may be indicated when it is desired to
use the test results as a basis for decision to recommend acceptance or
rejection of the modified system for fleet use.

Sequential analysis does not permit the exact number of trials
required to be determined in advance; however, an average or expected
number may be calculated. From a naval planning standpoint, ignorance
of the total number of trials required may pose some problems for sche-
duling, determination of material requirements, services and related
details. This may be a disadvantage, but the use of sequential tests can
result in a possible economy of trials required to reach a decision. This
economy may represent considerable savings of time and services.

In order to use Sequential Analysis, the planner must be able to spe- 
cify certain quantities or inputs. The following hypothetical example will 
illustrate these inputs and how they might be specified. The example will 
be similar to testing problems of the type faced by fleet testing agencies. 
The test results which might have been obtained for this hypothetical
example are shown in Table 1.

EXAMPLE: A destroyer equipped with a supposedly improved anti- 
submarine weapon is to be tested to determine the hit probability of this 
system (designated as system II). The hit probability of system II is to 
be compared with the known hit probability of otherwise identical des-
troyers employing a weapon which has been in service use (designated as
system I). We shall assume the hit probability of system I to be .2.
Also, we shall assume that the cost of the two systems is approximately 
the same. 

Assume finally, that it is desired to specify a sequential test. The 
outcome of testing will indicate whether to accept or reject system II as 
being better on the basis of the sample of trial runs. Therefore, the 
inputs of the Sequential Test must be carefully specified. These inputs 
uniquely define the sequential test: 

(a) p_1 - The hit probability which would make the new system II
          "Unacceptable".

(b) p_2 - The hit probability which would make the new system II
          "Acceptable".

(c) α   - The maximum allowable risk or probability of rejecting a
          new system II if it has a hit probability p_2 or better.

(d) β   - The maximum allowable risk or probability of accepting
          system II if it has a hit probability of p_1 or less.



SPECIFICATION OF p_1, p_2, α, β  In this example, the first
input, p_1, is probably the easiest to select. Since p_1 is the
hit probability of a new system which would make it unacceptable, it
seems logical that any new system would not be desirable if it were no
better than the existing one, that is, if its hit probability was no better
than that of system I, which we have assumed to be about .2. Hence
set p_1 = .2.

The second input, p_2, might be selected by reasoning as follows:
At first thought it would appear that any system with a hit probability
greater than that of existing systems is "acceptable"; however, since
the test will consume time and money it would not be logical to accept
a system with a hit probability only slightly greater than existing sys-
tems, say, p between .2 and .3. On the other hand it might be
argued that the new system should be at least twice as good as the old
to justify expense of conversion, that is, an increase of p to .4
would justify the expense of tests and installation of the new system if
it were accepted.



The inputs α and β represent the probabilities of making a
wrong decision and these risks are unavoidable. The values of α
and β are small and not necessarily equal.

α is the probability of rejecting the new system if it has a hit
probability of .4. Since it would be undesirable to reject a new system
which is, on the average, twice as good as the existing one, then the pro-
bability of making such an error should be made very small. If a risk
of one chance in 100 can be tolerated then α would be .01.

β, on the other hand, is the probability of accepting a new system
if it has a hit probability p_1. If the test led us to such a wrong
decision it would mean that we were accepting a new
system which was actually no better, perhaps worse, in terms of hit
probability than the existing one. Hence, we want to make the probabi-
lity of making such an error small also. But, we can tolerate a greater
risk of this error than we can of rejecting a better system, so β can
be made larger than α. In other words, accepting system II when
it has the same hit probability as the old system is not too serious from
a military standpoint. Therefore if β were selected as .10, this
would be taking one chance in ten of making such a wrong decision. For
illustrative purposes, choose α = .01 and β = .10.

Since either error is possible, it would seem desirable to have the
risks, α and β, of making such errors as small as possible,
that is, to make α and β even smaller than the values chosen above.
It will be seen that demanding smaller risks will result in having to make
a greater number of trials and this number can become unacceptably
large; therefore some risk must be tolerated to avoid a prohibitive num-
ber of trials.
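
The growth in the number of trials as the risks are reduced can be sketched with Wald's approximate formula for the average number of trials in a sequential test. This is an illustration under stated assumptions, not a computation taken from the thesis: the formula neglects any overshoot of the test boundaries, it is evaluated only at the two hit probabilities p_1 and p_2, and the function name expected_trials is hypothetical.

```python
from math import log


def expected_trials(p1, p2, alpha, beta):
    """Approximate average number of trials at p = p1 and at p = p2.

    Uses Wald's approximation, which ignores overshoot of the boundaries:
    E(n) = [P_rej * ln(B) + P_acc * ln(A)] / E(z), where z is the
    log likelihood-ratio contribution of a single trial.
    """
    ln_a = log((1.0 - alpha) / beta)   # reaching this accepts system II
    ln_b = log(alpha / (1.0 - beta))   # reaching this rejects system II
    z_hit = log(p2 / p1)
    z_miss = log((1.0 - p2) / (1.0 - p1))

    def at(p, prob_reject):
        mean_z = p * z_hit + (1.0 - p) * z_miss
        return (prob_reject * ln_b + (1.0 - prob_reject) * ln_a) / mean_z

    # At p1 the system is rejected with probability 1 - beta; at p2 with alpha.
    return at(p1, 1.0 - beta), at(p2, alpha)


for alpha, beta in ((0.01, 0.10), (0.01, 0.05), (0.001, 0.01)):
    n_at_p1, n_at_p2 = expected_trials(0.2, 0.4, alpha, beta)
    print(f"alpha={alpha} beta={beta}: "
          f"{n_at_p1:.1f} trials at p1, {n_at_p2:.1f} at p2")
```

These are averages only; any single testing program may end sooner or later than the figure computed.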

Having selected the inputs

        p_1 = .2     p_2 = .4     α = .01     β = .10

it is mathematically possible to determine what is defined as a Power
Curve, π(p). This curve will represent the probability of rejecting
system II as a function of p. It will have the following general shape:

Figure 2. Power Curve for Case A (π(p) plotted against p)

The ordinate at any point p = p' represents the probability,
π(p'), of rejecting the new system when its hit probability is p'.
At p = p_2 = .4, π(p) has the value of α = .01, which is the
probability of rejecting system II when its hit probability is .4. At
this point we are taking one chance in 100 of rejecting system II when its
true p is equal to .4. Note that if system II has a p > .4, we
have a still smaller probability of rejecting it.

At the point p = p_1 = .2 we have a very high probability of reject-
ing the new system and since (one minus the probability of rejection) is
equal to the probability of acceptance, then [1 - π(p_1)] = β = .10.
This is the risk we are willing to take in accepting the new system when
it is no better than the existing one.

Between p_1 and p_2 the new system will be rejected with probabi-
lities varying from 1 - β to α, the probability of rejection
decreasing as we approach the "acceptable" hit probability of .4.
The Power Curve need not actually be produced in order to make 
use of sequential testing. 
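
Although the curve is not needed to run the test, points on it can be sketched from Wald's approximate parametric formulas for a sequential test on a probability of success. The sketch below is an illustration under that assumption (it neglects overshoot of the boundaries); the function name power_curve_point and the auxiliary parameter h are not taken from the thesis. Each value of h gives one pair (p, π(p)).

```python
def power_curve_point(h, p1=0.2, p2=0.4, alpha=0.01, beta=0.10):
    """Return (p, pi(p)) for one value of Wald's auxiliary parameter h != 0.

    pi(p) is the approximate probability that the sequential test
    rejects system II when the true hit probability is p.
    """
    big_a = (1.0 - alpha) / beta    # likelihood ratio that accepts system II
    big_b = alpha / (1.0 - beta)    # likelihood ratio that rejects system II
    hit_ratio = p2 / p1
    miss_ratio = (1.0 - p2) / (1.0 - p1)
    p = (1.0 - miss_ratio ** h) / (hit_ratio ** h - miss_ratio ** h)
    prob_reject = (big_a ** h - 1.0) / (big_a ** h - big_b ** h)
    return p, prob_reject


# h = 1 gives p = p1 and h = -1 gives p = p2; h = 0 is excluded (0/0).
for h in (3.0, 2.0, 1.0, 0.5, -0.5, -1.0, -2.0):
    p, prob = power_curve_point(h)
    print(f"p = {p:.3f}   pi(p) = {prob:.3f}")
```

At h = 1 and h = -1 the sketch reproduces the two design points discussed above, 1 - β at p_1 and α at p_2.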

USE OF THE TEST  The four inputs, p_1, p_2, α, β, are
used to construct a graph which is the basis of the sequential test:

Figure 3. Graph of Sequential Test (number of successes plotted against number of trials)

The graph is easy to use. Assume that the results given in Table 1 are
being obtained from a testing program where the outcome of each trial
is denoted as success or failure. After each trial, the total number of
successes is plotted against the total number of trials conducted thus
far. As trials progress the plotted point will either fall in one of the
shaded regions labelled Accept System II and Reject System II, or it will
be in between the parallel lines. Trials are continued as long as the point
remains between the parallel lines but eventually one of the shaded regions
will be reached. It is this uncertainty as to when one of these regions will
be reached that precludes advance determination of the number of trials
required.



The details of constructing the straight lines which comprise the
boundaries of the three regions are given in Appendix A.

Figures 6, 7 and 8 are constructed using the same test results
given in Table 1, for three different sets of input parameters, and will be
described as Cases (A), (B) and (C).
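
The plotting procedure can also be carried out arithmetically. The following sketch assumes Wald's standard sequential probability ratio test, in which a running logarithm of the likelihood ratio takes the place of the plotted point and two fixed values take the place of the parallel lines; it is not the construction of Appendix A, and the function name sequential_test is hypothetical. The outcome strings are invented for illustration and are not the results of Table 1.

```python
from math import log


def sequential_test(outcomes, p1=0.2, p2=0.4, alpha=0.01, beta=0.10):
    """Apply Wald's sequential test to a string of 'S'/'F' outcomes.

    Returns (decision, trial_number); the decision is "continue" if the
    outcomes run out before either region is reached.
    """
    accept_at = log((1.0 - alpha) / beta)     # upper stopping value
    reject_at = log(alpha / (1.0 - beta))     # lower stopping value
    gain_hit = log(p2 / p1)                   # added after a success
    gain_miss = log((1.0 - p2) / (1.0 - p1))  # added (negative) after a failure

    log_ratio = 0.0
    for trial, outcome in enumerate(outcomes, start=1):
        log_ratio += gain_hit if outcome == "S" else gain_miss
        if log_ratio >= accept_at:
            return "accept System II", trial
        if log_ratio <= reject_at:
            return "reject System II", trial
    return "continue", len(outcomes)


# Hypothetical outcome strings, not the results of Table 1.
print(sequential_test("SSSS"))
print(sequential_test("F" * 20))
print(sequential_test("FSFS"))
```

With the inputs of the example, an unbroken run of successes leads to acceptance much sooner than an unbroken run of failures leads to rejection, which reflects the choice of α smaller than β.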



Trial number    1   2   3   4   5   6   7   8   9  10  11
Outcome         F   F   S   ?   S   S   F   ?   S   ?   F

Trial number   12  13  14  15  16  17  18  19  20  21  22
Outcome         ?   ?   F   S   ?   F   S   S   F   ?   F

Trial number   23  24  25  26  27  28  29  30  31  32  33
Outcome         ?   F   F   F   F   F   S   S   ?   S   S



Table of Assumed Test Results 
Table 1 



CASE A 

Figure 6 illustrates the results that would have been obtained using
the values

        p_1 = .2     p_2 = .4     α = .01     β = .10

Note that acceptance of System II as having a hit probability .4 or
greater, would have occurred at the 19th trial where the plotted point
crossed the boundary into the acceptance region.

The Power Curve for this particular set of input parameters has
been shown earlier.

CASE B

Input parameters:

        p_1 = .2     p_2 = .4     α = .01     β = .05

Here, the power curve will appear much the same as for Case
(A) but note that β is now smaller.

Figure 4. Power Curve for Case B (π(p) plotted against p)



Decreasing β from .10 in Case (A) to the value of .05 in this
case, is equivalent to saying that the planner wants a smaller risk of
accepting System II if it is no better than System I. A decrease in the
risk might be considered necessary because of the economic aspects,
such as the desire not to take too great a risk of changing to a new sys-
tem which is in reality no better than the existing one, if the change
represents considerable expense. This is equivalent to desiring more
protection against the risk of wrong decision and will result in more
trials being required.

Figure 7 shows the test data plotted. The boundaries are different
from Case (A), as indicated by the equations of the lines.

Note that the acceptance of System II results at the 33rd trial as opposed
to the 19th trial in Case (A). This increased number of trials is the
result that was anticipated. The number of trials, for the smaller risk
β, is greater.

CASE C

Input parameters:

        p_1 = .2     p_2 = .6     α = .01     β = .10

Before discussing test results in this case, it is important to note the
implications of setting p_2 = .6.

The power curve will have the following general shape:

Figure 5. Power Curve for Case C (π(p) plotted against p)

Here, p_1 and α are the same as in Case (A) but the planner is now
saying he will accept the new weapon on the assumption p_2 = .6 and
will take the same risk α = .01 of rejecting the new system for
p = .6. Note from the power curve that if the true p of System II
is in the neighborhood of .4 or .5 he is taking a greater chance of reject-
ing System II than in Case (A). This is just another way of saying that
he is not too concerned with the interval between .2 and .6 so the
test will reject systems which have a true hit probability in this region
with a higher probability than in Case (A). Or, System II must have a
higher hit probability, on the average, than in Case (A), to be accepted
by this test.

Figure 8 shows the result of the test data. The boundaries are
again different from those of Cases (A) and (B).

Note that on the 28th trial the sequential test indicated rejection of
System II as not having a hit probability of .6 or greater.
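
The way the boundaries shift from case to case can be sketched by computing the two parallel lines for each set of inputs. The sketch assumes Wald's standard formulas for the acceptance and rejection lines of a sequential test on a probability of success; the construction actually used for Figures 6, 7 and 8 is that of Appendix A, which is assumed, not confirmed, to correspond. The function name boundary_lines is hypothetical.

```python
from math import log


def boundary_lines(p1, p2, alpha, beta):
    """Return (slope, accept_intercept, reject_intercept).

    Accept System II once successes >= accept_intercept + slope * n;
    reject it once successes <= reject_intercept + slope * n.
    """
    gain_hit = log(p2 / p1)
    gain_miss = log((1.0 - p2) / (1.0 - p1))
    spread = gain_hit - gain_miss
    slope = -gain_miss / spread
    accept_intercept = log((1.0 - alpha) / beta) / spread
    reject_intercept = log(alpha / (1.0 - beta)) / spread
    return slope, accept_intercept, reject_intercept


cases = {
    "A": (0.2, 0.4, 0.01, 0.10),
    "B": (0.2, 0.4, 0.01, 0.05),
    "C": (0.2, 0.6, 0.01, 0.10),
}
for name, inputs in cases.items():
    slope, acc, rej = boundary_lines(*inputs)
    print(f"Case {name}: slope {slope:.3f}, "
          f"accept intercept {acc:.2f}, reject intercept {rej:.2f}")
```

Under these formulas the smaller β of Case (B) raises the acceptance line without changing its slope, while the larger p_2 of Case (C) steepens both lines, so that more successes per trial are needed to reach the acceptance region.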

It is important to recall that in any of these three cases there is
always the chance that a wrong decision will be made. We have tried
to keep the chance or risk of such eventualities small, namely by keep-
ing α and β small, consistent with a reasonable number of trials.
A demand for less risk of wrong decisions will result in more trials
being required. Further, α and β had to be selected on what
might be called an "intuitively reasonable" basis. The sequential test
does not provide any means whereby the cost of testing and cost of
wrong decision can be reconciled.



Figure 6. Sequential Test for Case A (number of successes plotted against number of trials, n)

Figure 7. Sequential Test for Case B (number of successes plotted against number of trials, n)

Figure 8. Sequential Test for Case C (number of successes plotted against number of trials, n)

1. In Cases A, B and C, it should be noted that the
terms acceptance or rejection are used with reference to assumptions
or hypotheses about the new system. In other words, a statistical
test is the test of an assumption. In Cases A and B we tested the
hypothesis that System II had a hit probability of .4 and as a result
of the tests we accepted this hypothesis. In Case C, rejection of
System II was indicated.

There could be reasons for deciding not to use a new system even
if the statistical results indicated that it was desirable. For example,
the new system may be too heavy, or useless in rough seas, or require
excessive maintenance. If many such factors weighed against the new
system, obviously it would not be acceptable.

2. The example given is just one illustration of the use of sequential
tests. They are not limited to trials to determine hit probability
(as in binomial distributions) but could also be used in trials where, for
example, the miss distance between weapon and target is being measured.
While the details of the example used in this chapter would not
apply to such a test, the principles are the same. See [5].

3. It has been mentioned that advance planning may be handicapped
because the exact number of trials cannot be specified but this fact
should not preclude the use of Sequential Analysis. In many situations,
the knowledge of the expected number of trials required may be sufficient
for planning. The great improvement of Sequential Analysis over
the classical test procedures, namely, the reduction in the number of
trials required, should not be overlooked.


4. The risks α and β are inputs to the problem which are selected on
the basis of military considerations. The concept of cost was implied
when a value of the hit probability of an "acceptable" system was
chosen. Also, when the corresponding risk of rejecting such a system is
specified there is an associated idea of cost. Further, there is the
cost of the testing program which might include services, material,
cost of the weapons expended in test, etc. These costs are mentioned to
point out that, although the planner considers them when specifying the
inputs, the theory does not permit the planner to explicitly take such
costs into account.

A more general theory is described in the next chapter. Explicit
expressions for cost of testing and costs of wrong decisions provide a
means for determining, prior to testing, the number of trials to be
conducted.








STATISTICAL DECISION THEORY



The general theory may be useful to naval planners at the CNO
level. In order to discuss the application of the theory it seems
appropriate to describe a situation giving rise to a Statistical Decision
Problem.



1. A need for an Air-to-Air guided missile is recognized and
translated to an Operational Requirement by an office of CNO.

2. The Operational Requirement passes to the cognizant bureau.
A contract is let and a missile is produced. At this point testing by
bureau and contractor is involved, but this paper is not concerned with
these tests. At some future time the bureau will inform CNO that a
missile has been produced to satisfy the Operational Requirement.

3. It is assumed that CNO now desires an estimate, p_e, of the
percentage effectiveness, or probability of success, of the new missile
in future combat. In order to obtain this estimate, missile testing by
a testing agency, such as OpDevFor, is indicated. An office of CNO
must decide how many weapons should be tested (that is, how many
trials should be conducted) so that the testing agency may plan the
testing program. This question of the number of trials required
constitutes the Statistical Decision Problem.

Statistical Decision Theory provides a rational basis for answering
such questions; moreover, the answer is provided prior to testing,
which is advantageous from the point of view of planning. In this
chapter we shall examine the form of the answer provided by this theory in
a special set of circumstances which will be related to the problem posed
in the above example. This special set of circumstances includes:

(a) In the trials from which the estimate p_e is obtained, the
outcome of any trial must not be affected or influenced by the
outcome of preceding trials, i.e., the trials must be statistically
independent.

(b) Probability of success of the missile is defined as the
probability of successful launching, guiding and detonation of the
missile at the intended point. In actual combat there will exist
some value for this probability of success, which we will call
the true value of p. This value is an unknown quantity but
can be estimated by testing.

(c) The outcome of each trial is valued as one for success and
zero for failure. Note that failure here is not concerned with
which of the three phases of (b) fails. In any series of
independent trials, an estimate p_e of the true probability of
success p is given by the number of successes divided by the
total number of trials.
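
The estimate described in (c) can be written out as a brief sketch.
The function name and the ten trial outcomes below are hypothetical
and serve only to show the arithmetic.

```python
def estimate_success_probability(outcomes):
    """Return p_e, the number of successes divided by the number of trials.

    outcomes: list with one entry per trial, 1 for success and 0 for failure.
    """
    if not outcomes:
        raise ValueError("at least one trial is required")
    if any(value not in (0, 1) for value in outcomes):
        raise ValueError("each outcome must be 1 (success) or 0 (failure)")
    return sum(outcomes) / len(outcomes)


# Hypothetical record of ten independent missile trials.
trials = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]
print(estimate_success_probability(trials))  # 0.7
```

A failure is scored as zero regardless of whether launching, guiding
or detonation was at fault, in keeping with (c).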

When these special circumstances are fulfilled, then an answer is
given in the form of tables presented at the end of the chapter. The use
of these tables requires a specification of certain inputs to the problem.
These inputs will be explained in the light of the hypothetical testing
problem.






Input

One can conceive that the true value of p, since
it is a probability, can only take on values between zero and one. Hence
it can be visualized as any point in the interval [0, 1]. In this
example it is assumed that all values of p in this range are equally likely.
This merely says that the planner has no a priori knowledge favoring
some values of the true p above others. In technical language, he
has specified a uniform probability density function, f(p), for p.

Graphically:









[Figure: Uniform Probability Density Function over 0 ≤ p ≤ 1.0]




On the other hand, if the planner had some a priori knowledge
which would lead him to favor certain values of p, he might specify
a probability density function which reflects this knowledge, such as:




Figure 10
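
The two kinds of a priori specification can be compared numerically
in a short sketch. The triangular shape and its peak at p = .7 are
assumptions chosen purely for illustration and are not taken from the
figure. Any function used in this role must be non-negative on [0, 1]
and enclose unit area.

```python
def uniform_density(p):
    """No a priori preference: every value of p in [0, 1] is equally likely."""
    return 1.0 if 0.0 <= p <= 1.0 else 0.0


def triangular_density(p, mode=0.7):
    """Hypothetical prior favoring values of p near `mode` (0 < mode < 1)."""
    if p < 0.0 or p > 1.0:
        return 0.0
    if p <= mode:
        return 2.0 * p / mode
    return 2.0 * (1.0 - p) / (1.0 - mode)


def area_under(density, steps=1000):
    """Midpoint-rule approximation of the area under `density` on [0, 1]."""
    width = 1.0 / steps
    return sum(density((i + 0.5) * width) for i in range(steps)) * width


for density in (uniform_density, triangular_density):
    print(density.__name__, round(area_under(density), 6))
```

Both functions enclose an area of one, as any probability density
must. The second places more weight near p = .7 and less near the
ends of the interval.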






